end-to-end model
Limitations
While our study identifies clear separations between model hypothesis classes, our best models still have not reached the consistency ceiling of the neural and behavioral benchmarks we have compared against. All models were simultaneously trained across all eight scenarios of the Physion Dynamics Training Set, constituting around 16,000 total training scenarios (2,000 scenes per scenario) [Bear et al., 2021], with a Each C-SWM [Kipf et al., 2020] model was trained on For each stimulus, we compute the proportion of "hit" responses by The Correlation to A verage Human Response is the Pearson's correlation between the model probability-hit vector and the human proportion-hit vector, across stimuli per scenario. OCP Accuracy of humans and models is the average accuracy, across stimuli per scenario. To give the final values of the two quantities, we then compute the weighted mean and s.e.m. of the above per Note that these values are therefore different for each condition, but always the same across all models. All neural predictivities are reported on heldout conditions and their timepoints.
Controlling Steering Angle for Cooperative Self-driving Vehicles utilizing CNN and LSTM-based Deep Networks
Valiente, Rodolfo, Zaman, Mahdi, Ozer, Sedat, Fallah, Yaser P.
A fundamental challenge in autonomous vehicles is adjusting the steering angle at different road conditions. Recent state-of-the-art solutions addressing this challenge include deep learning techniques as they provide end-to-end solution to predict steering angles directly from the raw input images with higher accuracy. Most of these works ignore the temporal dependencies between the image frames. In this paper, we tackle the problem of utilizing multiple sets of images shared between two autonomous vehicles to improve the accuracy of controlling the steering angle by considering the temporal dependencies between the image frames. This problem has not been studied in the literature widely. We present and study a new deep architecture to predict the steering angle automatically by using Long-Short-Term-Memory (LSTM) in our deep architecture. Our deep architecture is an end-to-end network that utilizes CNN, LSTM and fully connected (FC) layers and it uses both present and futures images (shared by a vehicle ahead via Vehicle-to-Vehicle (V2V) communication) as input to control the steering angle. Our model demonstrates the lowest error when compared to the other existing approaches in the literature.
- North America > United States > Florida > Orange County > Orlando (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Transportation > Ground > Road (0.68)
- Automobiles & Trucks (0.68)
- Information Technology (0.47)
TeluguST-46: A Benchmark Corpus and Comprehensive Evaluation for Telugu-English Speech Translation
Akkiraju, Bhavana, Bandarupalli, Srihari, Sambangi, Swathi, Ravuri, Vasavi, Saraswathi, R Vijaya, Vuppala, Anil Kumar
Despite Telugu being spoken by over 80 million people, speech translation research for this morphologically rich language remains severely underexplored. We address this gap by developing a high-quality Telugu--English speech translation benchmark from 46 hours of manually verified CSTD corpus data (30h/8h/8h train/dev/test split). Our systematic comparison of cascaded versus end-to-end architectures shows that while IndicWhisper + IndicMT achieves the highest performance due to extensive Telugu-specific training data, finetuned SeamlessM4T models demonstrate remarkable competitiveness despite using significantly less Telugu-specific training data. This finding suggests that with careful hyperparameter tuning and sufficient parallel data (potentially less than 100 hours), end-to-end systems can achieve performance comparable to cascaded approaches in low-resource settings. Our metric reliability study evaluating BLEU, METEOR, ChrF++, ROUGE-L, TER, and BERTScore against human judgments reveals that traditional metrics provide better quality discrimination than BERTScore for Telugu--English translation. The work delivers three key contributions: a reproducible Telugu--English benchmark, empirical evidence of competitive end-to-end performance potential in low-resource scenarios, and practical guidance for automatic evaluation in morphologically complex language pairs.
- Europe > Austria > Vienna (0.14)
- Asia > India > Telangana > Hyderabad (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- (5 more...)
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
Wan, Zhen, Yang, Chao-Han Huck, Yu, Yahan, Tian, Jinchuan, Li, Sheng, Hu, Ke, Chen, Zhehuai, Watanabe, Shinji, Cheng, Fei, Chu, Chenhui, Kurohashi, Sadao
We introduce Speech-based Intelligence Quotient (SIQ) as a new form of human cognition-inspired evaluation pipeline for voice understanding large language models, LLM Voice, designed to assess their voice understanding ability. Moving beyond popular voice understanding metrics such as word error rate (WER), SIQ examines LLM Voice across three cognitive levels motivated by Bloom's Taxonomy: (1) Remembering (i.e., WER for verbatim accuracy); (2) Understanding (i.e., similarity of LLM's interpretations); and (3) Application (i.e., QA accuracy for simulating downstream tasks). We demonstrate that SIQ not only quantifies voice understanding abilities but also provides unified comparisons between cascaded methods (e.g., ASR LLM) and end-to-end models, identifies annotation errors in existing benchmarks, and detects hallucinations in LLM Voice. Our framework represents a first-of-its-kind intelligence examination that bridges cognitive principles with voice-oriented benchmarks, while exposing overlooked challenges in multi-modal training. Our code and data will be open source to encourage future studies.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- (9 more...)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (5 more...)
Evaluating Foundation Models with Pathological Concept Learning for Kidney Cancer
Gao, Shangqi, Wang, Sihan, Gao, Yibo, Wang, Boming, Zhuang, Xiahai, Warren, Anne, Stewart, Grant, Jones, James, Crispin-Ortuzar, Mireia
To evaluate the translational capabilities of foundation models, we develop a pathological concept learning approach focused on kidney cancer. By leveraging TNM staging guidelines and pathology reports, we build comprehensive pathological concepts for kidney cancer. Then, we extract deep features from whole slide images using foundation models, construct pathological graphs to capture spatial correlations, and trained graph neural networks to identify these concepts. Finally, we demonstrate the effectiveness of this approach in kidney cancer survival analysis, highlighting its explainability and fairness in identifying low- and high-risk patients. The source code has been released by https://github.com/shangqigao/RadioPath.
- North America > United States (0.47)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > Greece > Attica > Athens (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Health & Medicine > Therapeutic Area > Oncology > Kidney Cancer (1.00)
- Health & Medicine > Therapeutic Area > Nephrology (1.00)
- Government > Regional Government > North America Government > United States Government > FDA (0.47)
- North America > Canada > Ontario > Toronto (0.14)
- North America > Canada > Ontario > Hamilton (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
UniSTPA: A Safety Analysis Framework for End-to-End Autonomous Driving
Kou, Hongrui, Lyu, Zhouhang, Wang, Ziyu, Wang, Cheng, Zhang, Yuxin
As autonomous driving technology continues to advance, end-to-end models have attracted considerable attention owing to their superior generalisation capability. Nevertheless, such learning-based systems entail numerous safety risks throughout development and on-road deployment, and existing safety-analysis methods struggle to identify these risks comprehensively. To address this gap, we propose the Unified System Theoretic Process Analysis (UniSTPA) framework, which extends the scope of STPA from the operational phase to the entire lifecycle of an end-to-end autonomous driving system, including information gathering, data preparation, closed loop training, verification, and deployment. UniSTPA performs hazard analysis not only at the component level but also within the model's internal layers, thereby enabling fine-grained assessment of inter and intra module interactions. Using a highway Navigate on Autopilot function as a case study, UniSTPA uncovers multi-stage hazards overlooked by conventional approaches including scene design defects, sensor fusion biases, and internal model flaws, through multi-level causal analysis, traces these hazards to deeper issues such as data quality, network architecture, and optimisation objectives. The analysis result are used to construct a safety monitoring and safety response mechanism that supports continuous improvement from hazard identification to system optimisation. The proposed framework thus offers both theoretical and practical guidance for the safe development and deployment of end-to-end autonomous driving systems.
- Europe > France (0.04)
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
- (2 more...)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
FLARE: Feature-based Lightweight Aggregation for Robust Evaluation of IoT Intrusion Detection
Boswell, Bradley, Barrett, Seth, Rajaganapathy, Swarnamugi, Dorai, Gokila
The proliferation of Internet of Things (IoT) devices has expanded the attack surface, necessitating efficient intrusion detection systems (IDSs) for network protection. This paper presents FLARE, a feature-based lightweight aggregation for robust evaluation of IoT intrusion detection to address the challenges of securing IoT environments through feature aggregation techniques. FLARE utilizes a multilayered processing approach, incorporating session, flow, and time-based sliding-window data aggregation to analyze network behavior and capture vital features from IoT network traffic data. We perform extensive evaluations on IoT data generated from our laboratory experimental setup to assess the effectiveness of the proposed aggregation technique. To classify attacks in IoT IDS, we employ four supervised learning models and two deep learning models. We validate the performance of these models in terms of accuracy, precision, recall, and F1-score. Our results reveal that incorporating the FLARE aggregation technique as a foundational step in feature engineering, helps lay a structured representation, and enhances the performance of complex end-to-end models, making it a crucial step in IoT IDS pipeline. Our findings highlight the potential of FLARE as a valuable technique to improve performance and reduce computational costs of end-to-end IDS implementations, thereby fostering more robust IoT intrusion detection systems.
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- North America > United States > Georgia > Richmond County > Augusta (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)